The genome sequence of the black-footed limpet, Patella depressa (Pennant, 1777)

We present a genome assembly from an individual Patella depressa (the black-footed limpet; Mollusca; Gastropoda; Patellogastropoda; Patellidae). The genome sequence is 683.7 megabases in span. Most of the assembly is scaffolded into 9 chromosomal pseudomolecules. Gene annotation of this assembly on Ensembl identified 20,502 protein-coding genes.


Background
Patella depressa (Pennant, 1777), formerly also known by its junior synonym P. intermedia Murray in Knapp, 1857, has a confused taxonomy, although P. depressa has become the widely accepted name (https://www.marinespecies.org/aphia.php?p=taxdetails&id=151374). P. depressa occurs from Senegal in west Africa northwards and has been found on Anglesey in north Wales, but the current leading range edge is Porth Oer in north Wales. It is absent from Ireland as it did not cross the Irish Sea at the end of the last Ice Age. In the English Channel, breeding populations occur as far east as the Isle of Wight and the Cotentin Peninsula. It occurs primarily on moderately exposed and exposed shores, and is rarely found on seaweed-dominated sheltered shores in northern France and the British Isles. It is the dominant mid- and high-shore limpet from north Africa to southern Brittany, giving way to Patella ulyssiponensis lower down the shore and in rockpools. In response to climate warming, it has become the dominant limpet in the mid and upper reaches of shores in southwest England, and isolated individuals have been found east of the Isle of Wight. The northern leading range edge in north Wales retracted from Anglesey south to the Lleyn Peninsula during the cold spell of the 1960s to mid-1980s (Kendall et al., 2004), and recolonisation beyond the Lleyn has not occurred to date. It is absent from the Mediterranean and gives way to its sister species Patella caerulea to the east of the Alboran Front.
Patella depressa is capable of being a multiple brooder (Ribeiro et al., 2009). This pattern contrasts with the single brood recorded in the cooler 1940s (Orton & Southward, 1961). In response to recent warming, P. depressa now has multiple broods in southwest England (Moore et al., 2011), with an extended breeding season from March to October. Further south, in Portugal, it is reproductively active for much of the year, with a short resting season in the summer (Ribeiro et al., 2009). P. depressa does not exhibit protandry, having equal proportions of males and females (Borges et al., 2015; Orton & Southward, 1961). P. depressa has a shorter larval life (Ribeiro, 2009) than P. vulgata, and can be reared without feeding, suggesting at least partial lecithotrophy. This may restrict potential dispersal, which would explain the lack of colonisation of Ireland and the allopatric speciation from Patella caerulea following separation of Atlantic and Mediterranean populations. P. depressa larvae settle in shallow rockpools covered in crustose coralline algae, emerging from these nursery grounds after 1 to 3 years (Bowman, 1981; Seabra et al., 2020). Adults home but, unlike P. vulgata, do not aggregate under fucoid clumps (Moore et al., 2007a). P. depressa grazing may be less efficient in controlling fucoid germlings than that of P. vulgata (Moore et al., 2007b). Exclusion experiments throughout Europe show that P. depressa prevents algal colonisation in northern Spain and in central and southern Portugal (Coleman et al., 2006), where it is usually the only species of limpet present (Boaventura et al., 2002b). Intense intraspecific competition has been shown in P. depressa in Portugal (Boaventura et al., 2002a), especially between large and small size classes (Boaventura et al., 2003).

Genome sequence report
The genome was sequenced from one Patella depressa (Figure 1) collected from Godrevy, Cornwall, UK (50.24, -5.40). A total of 37-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 93-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 101 missing joins or mis-joins and removed 2 haplotypic duplications, reducing the scaffold number by 43.55% and increasing the scaffold N50 by 52.61%. The final assembly has a total length of 683.7 Mb in 35 sequence scaffolds with a scaffold N50 of 77.5 Mb (Table 1). The snailplot in Figure 2 provides a summary of the assembly statistics, while the distribution of assembly scaffolds on GC proportion and coverage is shown in Figure 3. The cumulative assembly plot in Figure 4 shows curves for subsets of scaffolds assigned to different phyla. Most (95.59%) of the assembly sequence was assigned to 9 chromosomal-level scaffolds. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
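For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed, together with the percentage-change arithmetic used to describe the effect of curation. The scaffold lengths in the usage note are toy values, not data from this assembly, and the pre-curation scaffold count of 62 is only inferred here from the reported 43.55% reduction to 35 scaffolds; it is not stated in the text.

```python
def n50(lengths):
    """Return the N50 of a set of scaffold lengths.

    The N50 is the length L such that scaffolds of length >= L
    together span at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_span = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_span:
            return length
    raise ValueError("n50() needs at least one scaffold length")


def percent_change(before, after):
    """Signed percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before


# Toy scaffold lengths (hypothetical, arbitrary units).
toy_lengths = [90, 80, 70, 40, 20]
toy_n50 = n50(toy_lengths)

# Assumed pre-curation scaffold count of 62, inferred from the
# reported reduction; only the final count of 35 is in the text.
scaffold_change = percent_change(62, 35)
```

With the toy lengths, the two longest scaffolds are the first to cover half of the total span, so the second-longest length is the N50. A negative value from `percent_change` indicates a reduction.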

Sample acquisition and nucleic acid extraction
A Patella depressa (specimen ID MBA-200706-002A, ToLID xgPatDepr1) was collected by hand from Godrevy, Cornwall, UK (latitude 50.24, longitude -5.40) on 2020-07-06. The specimen was collected by Nova Mieszkowska and Rob Mrowicki (Marine Biological Association), identified by Nova Mieszkowska, and then preserved in liquid nitrogen. [...] using solid-phase reversible immobilisation (SPRI) (Strickland et al., 2023a). In brief, the method employs a 1.8X ratio of AMPure PB beads to sample to eliminate shorter fragments and concentrate the DNA. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.
RNA was extracted from xgPatDepr1 in the Tree of Life Laboratory at the WSI using the RNA Extraction: Automated MagMax™ mirVana protocol (do Amaral et al., 2023). The RNA concentration was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit RNA Broad-Range (BR) Assay kit. RNA integrity was analysed using the Agilent RNA 6000 Pico Kit and Eukaryotic Total RNA assay. Protocols developed by the WSI Tree of Life laboratory are publicly available on protocols.io (Denton et al., 2023).
A Hi-C map for the final assembly was produced using bwa-mem2 (Vasimuddin et al., 2019) in the Cooler file format (Abdennur & Mirny, 2020). To assess the assembly metrics, the k-mer completeness and QV consensus quality values were calculated in Merqury (Rhie et al., 2020). This work was done using Nextflow (Di Tommaso et al., 2017) DSL2 pipelines "sanger-tol/readmapping" (Surana et al., 2023a) and "sanger-tol/genomenote" (Surana et al., 2023b). The genome was analysed within the BlobToolKit environment (Challis et al., 2020) and BUSCO scores (Manni et al., 2021; Simão et al., 2015) were calculated.
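The consensus quality value (QV) mentioned above is a Phred-scaled error rate. The sketch below shows the standard Phred conversion and a k-mer-based error estimate of the kind described for Merqury by Rhie et al. (2020), as we understand it. The function names and the k-mer counts in the usage lines are hypothetical and are not values from this assembly; Merqury itself should be used for real analyses.

```python
import math


def phred_qv(error_rate):
    """Convert a per-base error rate into a Phred-scaled quality value."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10.0 * math.log10(error_rate)


def kmer_qv(asm_only_kmers, total_asm_kmers, k):
    """Estimate a consensus QV from k-mer counts.

    asm_only_kmers: k-mers found in the assembly but not in the reads.
    total_asm_kmers: all k-mers found in the assembly.
    k: the k-mer size.
    Assumes a base is correct when every k-mer covering it is supported
    by the reads. Raises ValueError if no assembly-only k-mers exist,
    because the estimated error rate would then be zero.
    """
    p_base_correct = (1 - asm_only_kmers / total_asm_kmers) ** (1 / k)
    return phred_qv(1 - p_base_correct)


# Hypothetical counts, chosen only to illustrate the calculation.
example_qv = kmer_qv(asm_only_kmers=1_000, total_asm_kmers=10_000_000, k=21)
qv_for_one_error_per_kb = phred_qv(0.001)
```

On this scale, a QV of 30 corresponds to one expected error per 1,000 bases and each further 10 units to a tenfold lower error rate.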

Version
Table 3 contains a list of relevant software tool versions and sources.

Genome annotation
The Ensembl gene annotation system (Aken et al., 2016) was used to generate annotation for the Patella depressa assembly (GCA_948474765.1). Annotation was created primarily through alignment of transcriptomic data to the genome, with gap filling via protein-to-genome alignments of a select set of proteins from UniProt (UniProt Consortium, 2019).

Wellcome Sanger Institute - Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use.
The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible.
The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

Tim Regan
University of Edinburgh, Edinburgh, UK
This is a clear and concise genome report for an important rocky shore species. The data availability along with the methods used are well presented. Context for the ecological importance of this species along with its natural range was nicely presented in the Background introduction. I just noticed a few sentences in this section which could be rewritten to avoid ambiguity:
"It is absent from Ireland as it did not cross the Irish Sea at the end of the last Ice Age." I presume this is conjecture rather than established fact?
"Patella depressa is capable of being a multiple brooder (Ribeiro et al., 2009). This pattern contrasts to only developing one brood in the cooler 1940s (Orton & Southward, 1961)." In the 1940s, which were cooler?
"P. depressa has a shorter larval life (Ribiero, 2009) than P. vulgata, and can be reared without feeding, suggesting at least partial lecitrophy." Were these anoxic growing conditions? Could another possibility be that the larvae feed on microbes during these early stages?
Reviewer Expertise: phylogenomics, comparative genomics, transcriptomics, gastropods
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
The various protocols are appropriate and clearly described, with helpful links provided to supplement the descriptions provided.The work is certainly technically sound.
The materials and methods descriptions meet accepted standards for these data notes, and the datasets such as the assemblies of the two haplotypes are accessible and useable.
I spotted just a single typo. In Background, second paragraph: I think "lecitrophy" should be changed to "lecithotrophy".

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Patella depressa, xgPatDepr1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 683,711,263 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (90,014,739 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (77,532,951 and 48,282,134 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the mollusca_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/Patella%20depressa/dataset/CAOLEZ01/snail.

Figure 3. Genome assembly of Patella depressa, xgPatDepr1.1: BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/Patella%20depressa/dataset/CAOLEZ01/blob.

Figure 4. Genome assembly of Patella depressa, xgPatDepr1.1: BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/Patella%20depressa/dataset/CAOLEZ01/cumulative.

Figure 5. Genome assembly of Patella depressa, xgPatDepr1.1: Hi-C contact map of the xgPatDepr1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=R0pa8NFgRziHwpvHMaztxg.
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud DNA sequencing libraries were constructed according to the manufacturers' instructions. Poly(A) RNA-Seq libraries were constructed using the NEB Ultra II RNA Library Prep kit. DNA and RNA sequencing was performed by the Scientific Operations core at the WSI on Pacific Biosciences SEQUEL II (HiFi), Illumina HiSeq 4000 (RNA-Seq) and Illumina NovaSeq 6000 (10X) instruments. Hi-C data were also generated from muscle tissue of xgPatDepr1 using the Arima2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Genome assembly, curation and evaluation
Assembly was carried out with Hifiasm (Cheng et al., 2021) and haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). One round of polishing was performed by aligning 10X Genomics read data to the assembly with Long Ranger ALIGN and calling variants with FreeBayes (Garrison & Marth, 2012). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using SALSA2 (Ghurye et al., 2019). The assembly was checked for contamination and corrected using the gEVAL system (Chow et al., 2016) as described previously (Howe et al., 2021). Manual curation was performed using gEVAL, HiGlass (Kerpedjiev et al., 2018) and Pretext (Harry, 2022). The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2023), which runs MitoFinder (Allio et al., 2020) or MITOS (Bernt et al., 2013) and uses these annotations to select the final mitochondrial contig and to ensure the general quality of the sequence.
